gRPC: initial stable semconv support and testing#16304
gRPC: initial stable semconv support and testing#16304trask merged 10 commits intoopen-telemetry:mainfrom
Conversation
2c1a363 to
0f28710
Compare
206f9e9 to
e4b6d0d
Compare
zeitlinger
left a comment
There was a problem hiding this comment.
Thanks for the PR!
One issue I noticed: the stable gRPC semconv spec specifies that rpc.response.status_code should use the named status code (OK, CANCELLED, UNAVAILABLE, etc.), not the numeric value (0, 2, 14).
Suggest changing:
attributes.put(RPC_RESPONSE_STATUS_CODE, String.valueOf(status.getCode().value()));to:
attributes.put(RPC_RESPONSE_STATUS_CODE, status.getCode().name());(and updating the test assertions accordingly)
zeitlinger
left a comment
There was a problem hiding this comment.
In rpc/dup mode, span attributes can only hold one value for rpc.method. Since stable semconv wins (line 42-53 of RpcCommonAttributesExtractor), old metrics would silently get the stable method value (example.Greeter/SayHello) instead of the old one (SayHello).
The dualEmitContextCustomizer from #16298 was designed to fix this — it stashes the old rpc.method in context so RpcClientMetrics/RpcServerMetrics can swap it back for old metrics. But it's not wired into GrpcTelemetryBuilder.
Suggest adding to GrpcTelemetryBuilder.build(), after addOperationMetrics for each instrumenter:
clientInstrumenterBuilder
.addOperationMetrics(RpcClientMetrics.get())
.addContextCustomizer(
RpcMetricsContextCustomizers.dualEmitContextCustomizer(rpcAttributesGetter));serverInstrumenterBuilder
.addOperationMetrics(RpcServerMetrics.get())
.addContextCustomizer(
RpcMetricsContextCustomizers.dualEmitContextCustomizer(rpcAttributesGetter));(The customizer is a no-op when not in dup mode, so it's safe to always register.)
zeitlinger
left a comment
There was a problem hiding this comment.
Follow-up on the dualEmitContextCustomizer comment: the reason this wasn't caught is that there's no rpc/dup integration test for gRPC. The testBothSemconv task in instrumentation-api-incubator exercises RpcClientMetrics/RpcServerMetrics in isolation (manually calling the context customizer), but nothing tests the full gRPC pipeline in dup mode.
Adding a testBothSemconv task to instrumentation/grpc-1.6/javaagent/build.gradle.kts with -Dotel.semconv-stability.opt-in=rpc/dup would catch this: the test assertions would see old metrics with rpc.method=example.Greeter/SayHello instead of rpc.method=SayHello and fail, revealing the missing context customizer wiring.
zeitlinger
left a comment
There was a problem hiding this comment.
Two more things I noticed by comparing with #15879:
1. Captured request metadata uses old attribute key prefix in stable mode
GrpcAttributesExtractor.onEnd() always stores captured request metadata under rpc.grpc.request.metadata.<key>, even when stable semconv is active. The stable semconv spec defines the attribute as rpc.request.metadata.<key> (no grpc. infix). The big PR handles this via CapturedGrpcMetadataUtil.stableRequestAttributeKey() and branching on semconv mode — that's missing here.
2. Missing testStableSemconv task for grpc-1.6/library
The testStableSemconv Gradle task is only added to grpc-1.6/javaagent/build.gradle.kts, but the big PR also adds one to grpc-1.6/library/build.gradle.kts. The library tests exercise GrpcTelemetryBuilder.build() directly, which is the code path where things like the dualEmitContextCustomizer wiring would be caught.
zeitlinger
left a comment
There was a problem hiding this comment.
Done reviewing. Summary of findings:
rpc.response.status_codeshould use named status codes (status.getCode().name()) not numeric (String.valueOf(status.getCode().value())) per the stable gRPC semconv specdualEmitContextCustomizerfrom #16298 is not wired intoGrpcTelemetryBuilder, sorpc/dupmode would produce wrongrpc.methodvalues on old metrics- A
testBothSemconvGradle task withrpc/dupwould have caught the above - Captured request metadata always uses old
rpc.grpc.request.metadata.*prefix — stable semconv requiresrpc.request.metadata.*(missingCapturedGrpcMetadataUtil.stableRequestAttributeKey()) - Missing
testStableSemconvtask forgrpc-1.6/librarymodule (only javaagent has one)
Compared against the original big PR #15879 to identify gaps. Items 1 and 2 are also present in the big PR, so they're pre-existing rather than regressions from the split.
Per the stable gRPC semconv spec, rpc.response.status_code should use the named status code (OK, CANCELLED, UNAVAILABLE, etc.) not the numeric value (0, 2, 14). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Fix missed status code assertions in streaming and clientErrorThrown tests Use Status.Code.name() instead of String.valueOf(Status.Code.*.value()) for rpc.response.status_code assertions in AbstractGrpcStreamingTest and the clientErrorThrown test in AbstractGrpcTest. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In rpc/dup mode, span attributes can only hold one value for rpc.method. The dualEmitContextCustomizer stashes the old rpc.method value in context so RpcClientMetrics/RpcServerMetrics can use it for old metrics. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Suppress deprecation warning for RpcMetricsContextCustomizers in build() Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tests the full gRPC pipeline in rpc/dup mode to catch issues like missing context customizer wiring that would produce wrong rpc.method values on old metrics. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The stable semconv spec uses rpc.request.metadata.<key> instead of rpc.grpc.request.metadata.<key>. Add stableRequestAttributeKey() to CapturedGrpcMetadataUtil and use it when stable semconv is active. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The library tests exercise GrpcTelemetryBuilder.build() directly, which is the code path where issues like missing context customizer wiring would be caught. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Fix RPC_SYSTEM assertions for rpc/dup semconv mode in tests Replace equalTo(maybeStable(RPC_SYSTEM), "grpc") with dual RPC_SYSTEM and RPC_SYSTEM_NAME assertions in AbstractGrpcTest and AbstractGrpcStreamingTest, to correctly handle rpc/dup mode where both attributes are emitted. Also removes the now-unused maybeStable import from both test files. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- ArmeriaGrpcTest: use Status.Code.OK.name() instead of String.valueOf(Status.Code.OK.value()) for rpc.response.status_code, matching the GrpcAttributesExtractor implementation - grpc-1.6 build.gradle.kts: add testLatestDeps system property and IPv4 preference to testStableSemconv and testBothSemconv tasks, matching the existing test task configuration Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| } | ||
|
|
||
| /** Returns a new {@link GrpcTelemetry} with the settings of this {@link GrpcTelemetryBuilder}. */ | ||
| @SuppressWarnings("deprecation") // RpcMetricsContextCustomizers is deprecated for removal in 3.0 |
There was a problem hiding this comment.
the classes used for transition could just be internal classes
There was a problem hiding this comment.
I kind of like to make them public to support anyone outside of this repo going through the transition, although maybe that's not too realistic since they'd have to stick to our timeline as we'll be removing them in 3.0
No description provided.